Reproducible Research: Principles and Practice

Author

Martin Schweinberger

Welcome!

What You’ll Learn

By the end of this tutorial, you will understand:

  • Key concepts: Replication, reproduction, robustness, transparency
  • The crisis: Why reproducibility matters now more than ever
  • Practical strategies: How to make your research reproducible
  • Tools and techniques: Git, notebooks, documentation, DOIs
  • Best practices: Industry-standard workflows for transparency
  • Implementation: Step-by-step guides for immediate action

Essential for
Credible research
Collaborative science
Career advancement
Ethical responsibility


Who This Tutorial is For

All researchers who want to produce reliable, trustworthy science:

  • 🔬 Active researchers - Publishing or planning to publish
  • 🎓 Graduate students - Building research practices early
  • 👥 Research teams - Establishing collaborative workflows
  • 📊 Data analysts - Ensuring analysis transparency
  • 👨‍🏫 Educators - Teaching reproducible methods

Assumes: Basic research experience; no technical expertise required


Why Reproducibility Matters

The Reproducibility Crisis

A Crisis in Science

The numbers are alarming
- More than 70% of researchers have tried and failed to reproduce another scientist’s experiments (Baker 2016)
- More than half have failed to reproduce their own experiments (Baker 2016)
- Only 39% of psychology studies were successfully replicated (Open Science Collaboration 2015)
- An estimated $28 billion per year is wasted on irreproducible preclinical research in the US alone (Freedman, Cockburn, and Simcoe 2015)

Impact on trust
- Public confidence in science eroded (McRae 2018)
- Funding agencies demanding transparency
- Journals requiring data/code sharing
- Careers damaged by retracted papers

Timeline of the Crisis

Late 1990s-Early 2000s
- Failed replications in medical research (Ioannidis 2005)
- Seminal psychology experiments don’t hold up
- Questions about research practices emerge

2010s
- Reproducibility Project: Psychology published (Open Science Collaboration 2015)
- Only 36-47% of the original studies replicated successfully
- Crisis spreads to social sciences (Anderson et al. 2016)
- Nature survey reveals widespread concerns (Baker 2016)

2020s
- Increased focus on solutions
- Open science movement gains momentum
- Funder/journal mandates for transparency
- Tools and training widely available


Why Research Fails to Reproduce

Common causes

1. Methodological Issues (40%)
- Insufficient documentation
- Missing methodological details
- Underpowered studies
- Inappropriate statistical methods

2. Data Problems (35%)
- Data not available
- Data processing errors
- Outlier handling undocumented
- Raw data lost or modified

3. Computational Issues (25%)
- Software versions not recorded
- Code not shared or documented
- Random seeds not set
- Computing environment not described

4. Publication Bias
- Positive results preferentially published
- Negative results filed away
- P-hacking (trying multiple analyses)
- HARKing (Hypothesizing After Results Known)


Benefits of Reproducible Research

Why Invest in Reproducibility?

For Science
- Accelerates discovery (build on solid foundations)
- Prevents wasted resources
- Enables meta-analyses
- Increases public trust

For Your Career
- More citations (papers that share their data are cited more often (Piwowar and Vision 2013))
- Collaboration opportunities
- Funder/journal requirements met
- Professional reputation enhanced

For You
- Future you will thank you (easy to revisit)
- Collaborators can contribute effectively
- Errors caught early
- Work builds cumulatively


Part 1: Core Concepts

Understanding Key Terms

These terms are often confused—let’s clarify


Replication

Definition

Replication = Repeating a study’s procedure with new data to see if findings hold.

Formula: Same method + Different data → Similar results?

Characteristics
- Uses comparable (not identical) population
- Applies similar (not identical) procedures
- Tests robustness of findings
- Advances theory with new evidence

Example

Original study:  
- Does mindfulness reduce anxiety?  
- University students in California  
- 8-week mindfulness program  
- Result: 30% reduction in anxiety  
  
Replication study:  
- Same question and general method  
- University students in Germany  
- 8-week mindfulness program (translated)  
- Result: 25% reduction in anxiety  
→ Finding replicates! (similar effect)  

Types of replication

1. Direct/Close Replication
- Procedures as identical as possible
- Different sample from same population
- Tests if result was sample-specific

2. Conceptual Replication
- Tests same underlying hypothesis
- Different procedures/measures
- Tests if result is method-specific

3. Constructive Replication
- Extends original study
- New conditions or populations
- Tests boundary conditions


Reproduction

Definition

Reproduction (Computational Replication) = Repeating analysis with same data and same method to verify results.

Formula: Same method + Same data → Identical results

Also called
- Computational reproducibility
- Repeatability (McEnery and Brezina 2022)
- Analytic reproducibility

Characteristics
- Uses exact same dataset
- Applies exact same methods
- Should produce identical (or nearly identical) results
- Tests computational accuracy

Example

Original analysis:  
- Dataset: survey_responses.csv  
- Software: R version 4.0.0  
- Method: Linear regression  
- Result: β = 0.45, p = .003  
  
Reproduction attempt:  
- Same dataset: survey_responses.csv  
- Same software: R version 4.0.0  
- Same method: Linear regression  
- Result: β = 0.45, p = .003  
→ Analysis reproduces! ✓  
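
In practice, a reproduction attempt boils down to re-running the archived script on the archived data and checking that the numbers match. A minimal sketch, assuming a hypothetical survey_responses.csv with columns outcome and predictor:

# Reproduction check: same data + same method → same numbers?
data <- read.csv("data/raw/survey_responses.csv")   # the exact dataset used originally

model <- lm(outcome ~ predictor, data = data)       # the exact method reported
coef(summary(model))["predictor", ]                 # compare against the published estimate and p-value

# If the analysis involves randomness (bootstrapping, imputation, simulation),
# the original seed must also be recorded, e.g. set.seed(42).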

Levels of reproducibility

1. Computational Reproducibility
- Same code + same data = same results
- Technical verification
- Minimal interpretation needed

2. Practical Reproducibility (Schweinberger 2024)
- Others can actually run your analysis
- Code works on different computers
- Dependencies clearly specified
- Time/effort to reproduce is reasonable

3. Formal/Theoretical Reproducibility
- Given perfect conditions, should be reproducible
- May not be practically achievable
- Requires extensive documentation


Robustness

Definition

Robustness = Results remain stable when different procedures are applied to same/similar data.

Formula: Different methods + Same/similar data → Consistent conclusions?

Tests
- Sensitivity to analytical choices
- Impact of alternative specifications
- Effect of different assumptions
- Consistency across approaches

Example

Research question: Does education predict income?  
  
Method 1: Linear regression  
- Result: β = $5,000 per year of education  
  
Method 2: Quantile regression  
- Result: β = $4,800 per year of education  
  
Method 3: Matching analysis  
- Result: β = $5,200 per year of education  
  
→ Results robust across methods!  
  (Direction and magnitude consistent)  
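
A sketch of what such a check can look like in R, assuming a hypothetical dataset with income and education columns and the quantreg package for the quantile regression:

library(quantreg)   # quantile regression (assumed to be installed)

dat <- read.csv("data/processed/income_survey.csv")   # hypothetical dataset

# Method 1: ordinary least squares
m_ols <- lm(income ~ education, data = dat)

# Method 2: median (quantile) regression
m_med <- rq(income ~ education, tau = 0.5, data = dat)

# Compare the education coefficient across methods
c(ols    = unname(coef(m_ols)["education"]),
  median = unname(coef(m_med)["education"]))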

Why robustness matters
- Single analysis may have hidden assumptions
- Different approaches test different aspects
- Consistent results = stronger evidence
- Inconsistent results = need for investigation

Robustness checks
- Alternative statistical models
- Different variable specifications
- Subset analyses
- Sensitivity analyses
- Alternative data processing


Triangulation

Definition

Triangulation = Using multiple approaches (methods, data sources, theories) to address a single research question.

Formula: Multiple approaches → Converging evidence?

Types

1. Data Triangulation
- Multiple datasets
- Different time periods
- Different populations
- Different sources

2. Method Triangulation
- Quantitative + qualitative
- Multiple statistical approaches
- Experimental + observational

3. Investigator Triangulation
- Multiple researchers independently analyze
- Reduces individual bias
- Increases confidence

4. Theory Triangulation
- Multiple theoretical frameworks
- Different perspectives
- Comprehensive understanding

Example

Research question: Is social media use linked to depression?  
  
Approach 1: Survey data (self-report)  
- Result: Positive correlation (r = .30)  
  
Approach 2: Behavioral data (actual usage)  
- Result: Positive correlation (r = .25)  
  
Approach 3: Experimental intervention  
- Result: Reducing use decreases symptoms  
  
Approach 4: Qualitative interviews  
- Result: Users report negative mood after heavy use  
  
→ Multiple approaches converge on same conclusion!  
  (Increases confidence in finding)  

Why triangulation?
- Each method has limitations
- Multiple approaches compensate
- Convergent evidence = stronger claims
- Addresses replication crisis limitations (Munafò and Davey Smith 2018)


Transparency

Definition

Transparency = Clear, comprehensive reporting of all aspects of the research process.

Formula: Complete information → Others can understand and evaluate

Dimensions

1. Design Transparency
- Research questions stated upfront
- Hypotheses pre-registered (ideally)
- Sampling strategy documented
- Power analysis reported

2. Data Transparency
- Data collection methods detailed
- Data processing steps documented
- Raw data shared (when ethical)
- Deviations from plan noted

3. Analysis Transparency
- All analyses reported (not just significant)
- Analysis code shared
- Software versions specified
- Decision-making explained

4. Results Transparency
- Full results (including null findings)
- Confidence intervals reported
- Effect sizes reported
- Alternative explanations considered

Levels of transparency

Level 1: Published Article
- Methods section with sufficient detail
- Clear results reporting
- Limitations acknowledged

Level 2: Supplementary Materials
- Extended methods
- Additional analyses
- Robustness checks

Level 3: Data Sharing
- Deidentified dataset
- Codebook/data dictionary
- Processing scripts

Level 4: Full Materials
- Raw data
- Complete code
- Computational environment
- Stimuli/materials

Level 5: Pre-registration
- Hypotheses registered before data collection
- Analysis plan pre-specified
- Deviations transparently reported


Relationships Between Concepts

                    TRANSPARENCY  
                    (Foundation)  
                         ↓  
        ┌────────────────┼────────────────┐  
        ↓                ↓                ↓  
  REPRODUCTION      REPLICATION       ROBUSTNESS  
  (Same data)     (New data)      (Different methods)  
        ↓                ↓                ↓  
        └────────────────┼────────────────┘  
                         ↓  
                  TRIANGULATION  
              (Multiple approaches)  
                         ↓  
                RELIABLE KNOWLEDGE  

How they work together
1. Transparency enables all others
2. Reproduction verifies computational accuracy
3. Replication tests generalizability
4. Robustness confirms not method-dependent
5. Triangulation provides strongest evidence


Part 2: The Reproducibility Spectrum

Levels of Reproducibility

Not all research needs the same level of reproducibility; match the level to your goals

Level 0: Not Reproducible

Characteristics
- No data or code available
- Insufficient methodological detail
- Results can’t be verified

When acceptable
- Never (for published research)

Risk
- Claims can’t be evaluated
- Errors undetectable
- Science doesn’t advance


Level 1: Reproducible Publication

Characteristics
- Detailed methods section
- Complete statistical reporting
- Supplementary materials
- Data availability statement

Enables
- Understanding what was done
- Critical evaluation
- Conceptual replication

Good for
- Theoretical papers
- Reviews
- Qualitative research (with restrictions)


Level 2: Reproducible Analysis

Characteristics
- Data publicly available (or on request)
- Analysis code shared
- Codebook provided
- Basic documentation

Enables
- Verification of results
- Alternative analyses
- Extension of work

Good for
- Quantitative research
- Standard analyses
- Published datasets

Requirements
- Data sharing agreement
- Code with comments
- README file


Level 3: Fully Reproducible

Characteristics
- Complete workflow documented
- Version-controlled code
- Computational environment specified
- Continuous integration

Enables
- Push-button reproduction
- Exact result replication
- Long-term reproducibility

Good for
- Computational research
- Complex analyses
- High-stakes findings

Requirements
- Docker/virtual environment
- Dependency management (renv, conda)
- Automated workflows
- Comprehensive documentation


Level 4: Reproducible Science Ecosystem

Characteristics
- Pre-registration
- Registered reports
- Open materials
- Open peer review

Enables
- Prevention of p-hacking
- Null results published
- Complete transparency
- Cumulative science

Good for
- Experimental research
- Hypothesis testing
- Confirmatory analyses

Requirements
- Pre-registration platform
- Open Science Framework
- Transparent reporting
- Commitment to openness


Choosing Your Level

Decision Guide

Minimum (Level 2) recommended for
- All quantitative research
- Computational analyses
- Claims based on data

Higher levels (3-4) essential for
- High-impact findings
- Policy-relevant research
- Contested areas
- Computational methods papers

Factors to consider
- Ethical constraints (sensitive data)
- Copyright restrictions
- Resource availability
- Field norms
- Funder requirements


Part 3: Practical Strategies

1. Project Organization

Standard Folder Structure

Use templates for consistency

ProjectName/  
├── README.md  
├── data/  
│   ├── raw/              ← Never edit!  
│   ├── processed/  
│   └── metadata/  
├── code/  
│   ├── 01_clean.R  
│   ├── 02_analyze.R  
│   └── 03_visualize.R  
├── output/  
│   ├── figures/  
│   ├── tables/  
│   └── reports/  
├── docs/  
│   ├── manuscript/  
│   ├── presentations/  
│   └── notes/  
└── environment/  
    ├── renv.lock  
    └── Dockerfile  

Benefits
- Anyone can navigate
- Automated workflows possible
- Reduces errors
- Portable across projects
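
If you prefer to set this up programmatically, a minimal sketch in base R (the folder names mirror the template above; adapt to your project):

# Create the standard project skeleton (safe to re-run)
dirs <- c("data/raw", "data/processed", "data/metadata",
          "code", "output/figures", "output/tables", "output/reports",
          "docs/manuscript", "docs/presentations", "docs/notes",
          "environment")
for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)

# Add a README stub so the structure is documented from day one
writeLines(c("# ProjectName", "", "TODO: describe the project."), "README.md")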


File Naming Conventions

Formula

YYYY-MM-DD_project_description_version.extension  

Example progression

2024-01-15_analysis_script_v1.R  
2024-01-20_analysis_script_v2.R  
2024-01-25_analysis_script_final.R  

Avoid

❌ finalFINAL.R  
❌ use_this_one.R  
❌ analysis (1).R  
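
A small helper for generating names that follow this pattern automatically (a sketch; the description and version number are illustrative):

# Build a dated, versioned file name, e.g. "2024-01-15_analysis_script_v1.R"
make_filename <- function(description, version, ext = "R") {
  sprintf("%s_%s_v%d.%s", format(Sys.Date(), "%Y-%m-%d"), description, version, ext)
}
make_filename("analysis_script", 1)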

2. Documentation

The Bus Factor

Bus Factor = Number of team members who must be unavailable (hit by a bus) for the project to fail.

Most research projects have a Bus Factor of 1 (you!)

Solution: Document everything!


Essential Documentation

README File

Template

# Project Title  
  
## Overview  
Brief description of research question and approach.  
  
## Repository Structure  
- `data/`: All datasets  
  - `raw/`: Original data (DO NOT EDIT)  
  - `processed/`: Cleaned data  
- `code/`: Analysis scripts (run in order)  
- `output/`: Results  
  
## Reproducing Analysis  
1. Install R 4.3.0+  
2. Install required packages: `renv::restore()`  
3. Run scripts in order:  
   - `code/01_clean.R`  
   - `code/02_analyze.R`  
   - `code/03_visualize.R`  
  
## Data  
- Source: [describe origin]  
- Collection dates: YYYY-MM-DD to YYYY-MM-DD  
- Sample: N = XXX  
- See `data/metadata/codebook.csv` for variable descriptions  
  
## Software  
- R version: 4.3.0  
- Key packages: tidyverse (2.0.0), lme4 (1.1-30)  
- See `renv.lock` for complete environment  
  
## Citation  
[Your citation here]  
  
## Contact  
[Your email]  
  
## License  
[e.g., CC-BY 4.0]  

Codebooks

Purpose: Explain every variable in the dataset

Template

# Variable: participant_id  
- **Type**: Character  
- **Description**: Unique identifier for each participant  
- **Format**: P### (e.g., P001, P002)  
- **Range**: P001 to P150  
- **Missing**: No missing values  
  
# Variable: age  
- **Type**: Integer  
- **Description**: Participant age in complete years  
- **Range**: 18 to 75  
- **Missing**: -99 = refused to answer  
- **Notes**: Age at time of participation  
  
# Variable: condition  
- **Type**: Categorical  
- **Description**: Experimental condition assignment  
- **Values**:   
  - 1 = Control  
  - 2 = Treatment A  
  - 3 = Treatment B  
- **Missing**: No missing values  
- **Notes**: Random assignment, stratified by age group  
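
Much of a codebook’s skeleton can be drafted directly from the data and then annotated by hand. A minimal sketch, assuming your dataset is loaded as a data frame called data:

# Draft a codebook skeleton: one row per variable, descriptions filled in manually
codebook <- data.frame(
  variable    = names(data),
  type        = sapply(data, function(x) class(x)[1]),
  n_missing   = sapply(data, function(x) sum(is.na(x))),
  example     = sapply(data, function(x) paste(head(unique(x), 3), collapse = ", ")),
  description = ""   # complete by hand
)
write.csv(codebook, "data/metadata/codebook.csv", row.names = FALSE)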

Analysis Logs

Track every decision

# Analysis Log - ProjectName  
  
## 2024-01-15: Data Cleaning  
**Who**: MS  
**What**: Initial cleaning of survey data  
**Changes**:  
- Removed 12 duplicate entries (same participant_id)  
- Excluded 5 participants who didn't complete survey  
- Recoded -999 to NA for missing values  
- Created age_group variable  
**Files**:  
- Input: data/raw/survey_raw.csv  
- Output: data/processed/survey_cleaned.csv  
- Script: code/01_clean.R  
  
## 2024-01-20: Outlier Analysis  
**Who**: MS  
**What**: Examined outliers in reaction time data  
**Decision**: Removed 8 RTs > 3 SD (following  
 preregistration)  
**Rationale**: Likely indicates inattention  
**Files**:  
- Input: data/processed/survey_cleaned.csv  
- Output: data/processed/survey_final.csv  
- Script: code/02_outliers.R  
- Visualization: output/figures/outlier_plot.png  
  
## 2024-02-01: Main Analysis  
**Who**: MS  
**What**: Linear mixed effects model  
**Model**: RT ~ Condition + (1|Participant)  
**Result**: No significant effect of condition, β = 12.5,   
  p = .234  
**Notes**: Result differs from prediction - will explore in   
  robustness checks  
**Script**: code/03_main_analysis.R  

3. Version Control with Git

Why Git for Research?

Git tracks
- Every change to every file
- Who made each change
- When changes occurred
- Why changes were made (commit messages)

Benefits
- Never lose work (complete history)
- Experiment freely (can always revert)
- Collaborate seamlessly
- Document evolution
- Satisfy reproducibility requirements


Git Basics for Researchers

Core workflow

# 1. Initialize repository  
git init  
  
# 2. Make changes to files  
  
# 3. Stage changes  
git add analysis.R  
  
# 4. Commit with descriptive message  
git commit -m "Add demographic analysis"  
  
# 5. Push to GitHub (if using remote)  
git push origin main  

Writing Good Commit Messages

Formula

[Action verb] [what you did]  
  
Optional: Longer explanation of why  

Good examples

✓ "Add demographic variables to model"  
✓ "Fix calculation error in summary statistics"  
✓ "Remove outliers based on preregistered criteria"  
✓ "Update figure labels for clarity"  

Bad examples

✗ "stuff"  
✗ "changes"  
✗ "fixed it"  
✗ "final version (really!)"  

When to Commit

Commit frequently
- After completing a logical unit of work
- Before starting something new
- Before major changes
- When something works
- At end of work session

Each commit = restore point

Rule of thumb: 3-10 commits per day when actively working


Git with RStudio

RStudio has built-in Git integration!

Setup (one-time)
1. Tools → Project Options → Git/SVN
2. Version control system → Git
3. Restart RStudio

Daily workflow (visual interface)
1. Pull (get latest)
2. Work on files
3. Stage changes (checkboxes)
4. Write commit message
5. Commit
6. Push

No command line required!


.gitignore for Research

Create .gitignore file to exclude

# Large files  
*.pdf  
*.zip  
data/raw/*.csv  
  
# Sensitive data  
*_identifiable.csv  
private/  
  
# Temporary files  
.Rhistory  
.RData  
*.log  
  
# Output (can regenerate)  
output/figures/*.png  
output/tables/*.csv  
  
# System files  
.DS_Store  
Thumbs.db  

Share code, not necessarily all output


4. Computational Notebooks

What are Notebooks?

Integrate
- Code
- Output
- Narrative explanation
- Figures/tables

In single document!


R Markdown Example

---  
title: "Analysis of Survey Data"  
author: "Your Name"  
date: "2024-02-10"  
output: html_document  
---  
  
# Introduction  
  
This analysis examines the effect of mindfulness   
training on anxiety scores (N = 150).  
  
**Hypothesis**: Mindfulness training will reduce   
anxiety scores compared to control.  
  
# Setup  
  
```{r}
library(tidyverse)  
library(lme4)  
  
# Load data  
data <- read_csv("../data/processed/survey_clean.csv")  
  
# Set seed for reproducibility  
set.seed(42)  
```
  
# Descriptive Statistics  
  
```{r}
data %>%  
  group_by(condition) %>%  
  summarise(  
    n = n(),  
    mean_anxiety = mean(anxiety_post),  
    sd_anxiety = sd(anxiety_post)  
  ) %>%  
  knitr::kable()  
```
  
**Finding**: Treatment group shows lower mean   
anxiety (M = 32.1) than control (M = 45.3).  
  
# Main Analysis  
  
```{r}
model <- lmer(anxiety_post ~ condition +   
                anxiety_pre + (1|participant_id),   
              data = data)  
summary(model)  
```
  
**Result**: Significant effect of condition,   
β = -13.2, t = 4.5, p < .001.  
  
# Visualization  
  
```{r}
ggplot(data, aes(x = condition, y = anxiety_post)) +  
  geom_boxplot() +  
  labs(y = "Post-treatment Anxiety",  
       x = "Condition") +  
  theme_minimal()  
```
  
# Conclusion  
  
Mindfulness training significantly reduced anxiety   
compared to control condition. Effect size was   
moderate (Cohen's d = 0.65).  

Notebook Benefits

Why Use Notebooks?

For reproducibility
- Complete analysis in one file
- Explanation with code
- Output embedded
- Renders to multiple formats (HTML, PDF, Word)

For collaboration
- Colleagues see reasoning
- Easy to review
- Self-contained

For publication
- Supplementary material
- Preprints
- Teaching materials
- Interactive tutorials


5. Managing Computational Environments

The Problem

Your code works on your computer, but…
- Different R version on collaborator’s machine
- Package versions updated, code breaks
- Results differ slightly due to dependencies

Solution: Document and manage your environment!


Using renv (R Environment Manager)

Setup (one-time per project)

# Install renv  
install.packages("renv")  
  
# Initialize for project  
renv::init()  

This creates
- renv.lock - Complete package list with versions
- renv/ folder - Project-specific library

Daily workflow

# Install packages as normal  
install.packages("tidyverse")  
  
# Take snapshot of current packages  
renv::snapshot()  
  
# Collaborator restores exact environment  
renv::restore()  

Benefits
- Exact package versions recorded
- Projects isolated (no conflicts)
- Easy to share environment
- Long-term reproducibility


Recording Session Info

At the end of every analysis script/notebook

sessionInfo()  

Output

R version 4.3.0 (2023-04-21)  
Platform: x86_64-apple-darwin20 (64-bit)  
Running under: macOS Big Sur 11.6  
  
attached packages:  
 [1] tidyverse_2.0.0  dplyr_1.1.2       
 [3] ggplot2_3.4.2    lme4_1.1-33       
  
loaded via namespace:  
 [1] Matrix_1.5-4   utf8_1.2.3      
...  

Why
- Documents exact software environment
- Helps troubleshoot version issues
- Required for full reproducibility
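
To keep a permanent record alongside your results, the same information can also be written to a file (a sketch; assumes the output/ folder from the project template exists):

# Save the session information next to the analysis outputs
writeLines(capture.output(sessionInfo()), "output/session_info.txt")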


Docker (Advanced)

For ultimate reproducibility

Docker = A complete computing environment packaged in a container

Benefits
- Includes OS, R version, all packages
- Runs identically anywhere
- Future-proof (environment frozen)

Dockerfile example

FROM rocker/tidyverse:4.3.0  
  
# Install additional packages  
RUN R -e "install.packages('lme4')"  
  
# Copy project files  
COPY . /home/rstudio/project  
  
# Set working directory  
WORKDIR /home/rstudio/project  

When to use
- Complex dependencies
- Long-term preservation
- High-stakes reproducibility


6. Data Sharing and DOIs

Persistent Identifiers

DOI (Digital Object Identifier) = Permanent link to resource

Format

https://doi.org/10.1234/example.dataset  

Why DOIs matter
- Link never breaks (websites change)
- Enables proper citation
- Tracks impact (metrics)
- Funder/journal requirement


Where to Get DOIs

For data

1. UQ Research Data Manager → UQ eSpace
- Free for UQ researchers
- Automatic DOI assignment
- Meets funder requirements
- https://research.uq.edu.au/rmbt/uqrdm

2. Open Science Framework (OSF)
- Free, open
- Integrates project management
- DOI for datasets, code, materials
- https://osf.io

3. Zenodo
- Free, open
- Large file support (50 GB per dataset)
- GitHub integration
- https://zenodo.org

4. Figshare
- Free for public data
- Good visualizations
- https://figshare.com


What to Share

Minimum
- Final analyzed dataset (deidentified)
- Analysis code
- README
- Codebook

Recommended
- Raw data (if shareable)
- Processing scripts
- Rendered notebook
- Materials (survey, stimuli)

Ideal
- Everything above
- Computational environment (renv.lock)
- Preregistration
- Registered report protocol


Data Citation

How to cite data

Author(s). (Year). Title of dataset [Data set].   
Repository Name. https://doi.org/10.xxxxx/xxxxx  

Example

Smith, J., & Doe, M. (2024). Mindfulness and anxiety   
survey data [Data set]. Open Science Framework.   
https://doi.org/10.17605/OSF.IO/XXXXX  

Include in references just like articles!


7. Pre-registration

What is Pre-registration?

Pre-registration = Publicly stating research plan before collecting data

Includes
- Research question(s)
- Hypotheses
- Sample size and justification
- Data collection procedures
- Analysis plan
- Inference criteria


Why Pre-register?

Prevents Questionable Research Practices

Problems pre-registration solves

P-hacking
- Trying many analyses until something is significant
- Pre-registration commits to analysis plan

HARKing
- Hypothesizing After Results are Known
- Pre-registration timestamps hypotheses

Publication bias
- Positive results published, negative filed away
- Pre-registration enables tracking of null results

Selective reporting
- Only reporting favorable analyses
- Pre-registration commits to reporting plan


How to Pre-register

Platforms

1. OSF Registrations
- Free, open
- Various templates
- Embargoes available
- https://osf.io/registries

2. AsPredicted
- Simple, quick
- 9 questions
- Time-stamped
- https://aspredicted.org

3. ClinicalTrials.gov
- For clinical research
- Often required
- Public registry


Pre-registration Template

Minimal pre-registration

# Pre-registration: [Project Title]  
  
## Study Information  
  
### Research Question  
What specific question(s) will this study address?  
  
### Hypotheses  
What are your specific, testable predictions?  
  
H1: [Directional prediction]  
H2: [Directional prediction]  
  
## Design  
  
### Sample  
- Population: [describe]  
- Sample size: N = XXX  
- Justification: [power analysis]  
- Stopping rule: [when to stop collecting]  
  
### Measures  
- DV: [how measured]  
- IV: [how manipulated/measured]  
- Covariates: [what will be controlled]  
  
## Analysis Plan  
  
### Data Exclusion  
What data will be excluded and why?  
- Participants who [specific criteria]  
- Trials/items where [specific criteria]  
  
### Outliers  
How will outliers be defined and handled?  
- Definition: [e.g., >3 SD from mean]  
- Treatment: [e.g., exclude, winsorize]  
  
### Missing Data  
How will missing data be handled?  
- [e.g., listwise deletion, imputation]  
  
### Primary Analysis  
What is the main statistical test?  
- Model: [specify]  
- Inference: [e.g., α = .05, two-tailed]  
  
### Robustness Checks  
What additional analyses will test robustness?  
1. [Alternative model specification]  
2. [Sensitivity analysis]  
  
## Other  
  
### Exploratory Analyses  
What analyses are exploratory (not confirmatory)?  
  
### Timeline  
When will data collection begin and end?  
  
### Registration Date  
[Auto-filled by platform]  

Registered Reports

Even stronger: the journal reviews the study design before data collection

Process
1. Submit protocol (intro, methods, analysis plan)
2. Peer review of design
3. In-principle acceptance (IPA)
4. Collect data and analyze as planned
5. Submit results
6. Published regardless of outcome

Benefits
- Prevents publication bias
- Improves study design
- Guarantees publication of null results
- Highest credibility

Journals offering registered reports
- https://cos.io/rr/


Part 4: The Reproducibility Checklist

Before Starting

Planning Phase

Project Setup
- [ ] Create standard folder structure
- [ ] Initialize Git repository
- [ ] Set up renv for package management
- [ ] Create README template
- [ ] Draft data management plan

Pre-registration (if appropriate)
- [ ] Formulate specific hypotheses
- [ ] Determine sample size (power analysis)
- [ ] Specify analysis plan
- [ ] Register on OSF/AsPredicted
- [ ] Time-stamp before data collection


During Research

Active Research Phase

Data Collection
- [ ] Document all procedures
- [ ] Store raw data separately (never edit!)
- [ ] Record all deviations from plan
- [ ] Maintain data collection log

Data Processing
- [ ] Comment all code thoroughly
- [ ] Use descriptive variable names
- [ ] Document all decisions
- [ ] Create processing log
- [ ] Set random seeds

Analysis
- [ ] Follow pre-registered plan (if applicable)
- [ ] Document exploratory analyses separately
- [ ] Run robustness checks
- [ ] Save all output
- [ ] Use notebooks for integration

Version Control
- [ ] Commit frequently (daily)
- [ ] Write descriptive commit messages
- [ ] Push to GitHub regularly
- [ ] Tag important versions

Documentation
- [ ] Update README as you go
- [ ] Maintain analysis log
- [ ] Create/update codebook
- [ ] Document software versions


Before Publication

Preparation Phase

Code Review
- [ ] Code runs from scratch
- [ ] No hard-coded paths (use relative)
- [ ] All dependencies documented
- [ ] Comments explain complex sections
- [ ] Random seeds set
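
A minimal sketch of the relative-paths and random-seed items above, assuming the here package and the folder structure from Part 3 (file and column names are illustrative):

library(here)   # builds paths relative to the project root, not your machine

set.seed(1234)  # fix the seed so any resampling gives identical results on every run

# Project-relative path instead of a hard-coded "C:/Users/..." path
dat <- read.csv(here("data", "processed", "survey_final.csv"))

# Example: a bootstrap of the mean that now reproduces exactly
boot_means <- replicate(1000, mean(sample(dat$anxiety_post, replace = TRUE)))
quantile(boot_means, c(0.025, 0.975))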

Data Preparation
- [ ] Deidentify if necessary
- [ ] Create final clean dataset
- [ ] Write comprehensive codebook
- [ ] Check for errors/inconsistencies
- [ ] Document data sources

Materials
- [ ] Compile all materials (surveys, stimuli)
- [ ] Create materials documentation
- [ ] Check copyright/permissions
- [ ] Prepare for sharing

Computational Environment
- [ ] Run renv::snapshot()
- [ ] Document R version
- [ ] List package versions
- [ ] Consider Docker (if complex)

Documentation
- [ ] README complete and accurate
- [ ] All scripts commented
- [ ] Analysis notebook renders
- [ ] Codebook comprehensive

Repository
- [ ] Choose appropriate repository
- [ ] Obtain DOI
- [ ] Upload all materials
- [ ] Set appropriate license (CC-BY recommended)
- [ ] Test that others can access


Publication

Sharing Phase

Manuscript
- [ ] Include data/code availability statement
- [ ] Cite data with DOI
- [ ] Reference pre-registration (if applicable)
- [ ] Report deviations from plan
- [ ] Include supplementary materials

Data/Code Sharing
- [ ] Repository link in manuscript
- [ ] DOI in data availability statement
- [ ] Make repository public (on acceptance)
- [ ] Respond to data requests promptly

Follow-up
- [ ] Monitor citations of data
- [ ] Update if errors found
- [ ] Consider registered report for replication


Part 5: Troubleshooting

Common Challenges

“I can’t share my data (privacy/ethics)”

Solutions
- Share deidentified data
- Share synthetic data (preserves structure)
- Share data on request (with agreement)
- Share metadata and code only
- Use restricted-access repository

Still enables
- Methodological review
- Code verification
- Understanding of approach
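
One simple way to produce a shareable synthetic dataset is to resample each variable independently. A sketch assuming a plain rectangular dataset; note that this preserves each variable’s distribution but deliberately breaks the relationships between variables (document this for reusers):

real <- read.csv("data/processed/survey_final.csv")   # hypothetical restricted dataset

set.seed(42)
synthetic <- as.data.frame(lapply(real, function(col) sample(col, replace = TRUE)))

write.csv(synthetic, "data/processed/survey_synthetic.csv", row.names = FALSE)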


“My code is messy/embarrassing”

Reality check
- All code starts messy
- Clean code is learned
- Shared code improves faster

Steps
1. Add comments explaining intent
2. Use consistent style (tidyverse, etc.)
3. Break into functions (DRY principle)
4. Test it works from scratch
5. Share anyway (helps others!)

Remember: Working code > perfect code


“I don’t have time for all this”

Time investment
- Initial setup: 4-6 hours
- Ongoing: 30 min/week

Time savings
- Finding files: -2 hours/week
- Rerunning lost analyses: -10 hours/project
- Responding to reviews: -5 hours
- Future you will thank you!

Start small
- Week 1: Folder structure + README
- Week 2: Git basics
- Week 3: Better documentation
- Week 4: Notebooks


“My collaborators don’t care about reproducibility”

Strategies
- Lead by example
- Share benefits (citations, efficiency)
- Start with small wins
- Make it easy for them
- Emphasize funder requirements

You can control
- Your own practices
- What you share
- Standards you maintain


“My field doesn’t do this”

Response
- Be a leader!
- Standards are changing
- Funders/journals requiring it
- Early adopters benefit most

Field-specific variations OK
- Adapt to your context
- Focus on key principles
- Build incrementally


Resources

Tools

Version Control
- Git - Version control system
- GitHub - Code hosting and collaboration
- GitLab - Alternative to GitHub

Notebooks
- R Markdown - R notebooks
- Quarto - Next-gen R Markdown
- Jupyter - Python notebooks

Environment Management
- renv - R package management
- conda - Python environment manager
- Docker - Containerization

Data Repositories
- OSF - Open Science Framework
- Zenodo - General-purpose repository
- UQ RDM - UQ Research Data Manager

Pre-registration
- OSF Registries - Comprehensive
- AsPredicted - Quick and simple
- Registered Reports - Journal commitments


Learning Resources

Guides
- British Ecological Society Guide - Reproducible research guide
- The Turing Way - Comprehensive handbook
- Reproducible Research Workshop

Courses
- LADAL Reproducibility with R - Practical R tutorial
- Software Carpentry - Programming for researchers
- Data Carpentry - Data skills

Papers
- Baker (2016) - Nature survey
- Munafò and Davey Smith (2018) - Manifesto for reproducible science
- Nosek and Errington (2020) - Replication and reproducibility
- Goodman, Fanelli, and Ioannidis (2016) - Definitions
- Wilson et al. (2017) - Good enough practices


Communities

Organizations
- Center for Open Science - Promoting openness
- rOpenSci - R packages for reproducibility
- The Carpentries - Training community

Forums
- RStudio Community - R help
- Stack Overflow - Programming Q&A
- ReproducibiliTea - Journal clubs


Quick Reference

Reproducibility Workflow

Every new project

1. Create folder structure  
2. Initialize Git (git init)  
3. Create README  
4. Set up renv (renv::init())  
5. Consider pre-registration  

Every analysis session

1. Pull latest (git pull)  
2. Work on code/analysis  
3. Commit frequently (git commit)  
4. Update documentation  
5. Push to remote (git push)  

Before publication

1. Test code runs from scratch  
2. Document environment (sessionInfo())  
3. Create codebook  
4. Obtain DOI  
5. Upload to repository  

Red Flags for Non-Reproducibility

Watch out for:
- ❌ No version control
- ❌ Data in emails
- ❌ Hard-coded file paths
- ❌ “It works on my machine”
- ❌ No documentation
- ❌ Manual data processing (undocumented)
- ❌ Unexplained “magic numbers” in the code
- ❌ Multiple “final” versions


Green Flags for Reproducibility

Look for:
- Git repository
- Standard folder structure
- README present
- Comprehensive documentation
- Commented code
- Notebook integrating analysis
- Environment documented
- Public repository with DOI


Citation & Session Info

Schweinberger, Martin. 2026. Reproducible Research: Principles and Practice. Brisbane: The Language Technology and Data Analysis Laboratory (LADAL). url: https://ladal.edu.au/tutorials/repro.html (Version 2026.02.10).

@manual{schweinberger2026repro,  
  author = {Schweinberger, Martin},  
  title = {Reproducible Research: Principles and Practice},  
  note = {https://ladal.edu.au/tutorials/repro.html},  
  year = {2026},  
  organization = {The Language Technology and Data Analysis Laboratory (LADAL)},  
  address = {Brisbane},  
  edition = {2026.02.10}  
}  
sessionInfo()  
R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: Australia/Brisbane
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] gapminder_1.0.0  lubridate_1.9.4  forcats_1.0.0    stringr_1.5.1   
 [5] dplyr_1.2.0      purrr_1.0.4      readr_2.1.5      tibble_3.2.1    
 [9] ggplot2_3.5.1    tidyverse_2.0.0  tidyr_1.3.2      here_1.0.1      
[13] DT_0.33          kableExtra_1.4.0 knitr_1.51      

loaded via a namespace (and not attached):
 [1] gtable_0.3.6      jsonlite_1.9.0    compiler_4.4.2    renv_1.1.1       
 [5] tidyselect_1.2.1  xml2_1.3.6        systemfonts_1.2.1 scales_1.3.0     
 [9] yaml_2.3.10       fastmap_1.2.0     R6_2.6.1          generics_0.1.3   
[13] htmlwidgets_1.6.4 munsell_0.5.1     rprojroot_2.0.4   svglite_2.1.3    
[17] tzdb_0.4.0        pillar_1.10.1     rlang_1.1.7       stringi_1.8.4    
[21] xfun_0.56         timechange_0.3.0  viridisLite_0.4.2 cli_3.6.4        
[25] withr_3.0.2       magrittr_2.0.3    grid_4.4.2        digest_0.6.39    
[29] rstudioapi_0.17.1 hms_1.1.3         lifecycle_1.0.5   vctrs_0.7.1      
[33] evaluate_1.0.3    glue_1.8.0        colorspace_2.1-1  rmarkdown_2.30   
[37] tools_4.4.2       pkgconfig_2.0.3   htmltools_0.5.9  



References

Anderson, C. J., S. Bahnik, M. Barnett-Cowan, F. A. Bosco, J. Chandler, C. R. Chartier, and N. Della Penna. 2016. “Response to Comment on "Estimating the Reproducibility of Psychological Science".” Science 351 (6277): 1037. https://doi.org/10.1126/science.aad9163.
Baker, Monya. 2016. “1,500 Scientists Lift the Lid on Reproducibility.” Nature 533 (7604): 452–54. https://doi.org/10.1038/533452a.
Open Science Collaboration. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349 (6251): aac4716. https://doi.org/10.1126/science.aac4716.
Freedman, Leonard P, Iain M Cockburn, and Timothy S Simcoe. 2015. “The Economics of Reproducibility in Preclinical Research.” PLoS Biology 13 (6): e1002165.
Goodman, S. N., D. Fanelli, and J. P. Ioannidis. 2016. “What Does Research Reproducibility Mean?” Science Translational Medicine 8 (341): 341ps12. https://doi.org/10.1126/scitranslmed.aaf5027.
Ioannidis, J. P. A. 2005. “Why Most Published Research Findings Are False.” PLoS Medicine 2 (8): e124. https://doi.org/10.1371/journal.pmed.0020124.
McEnery, Tony, and Vaclav Brezina. 2022. Fundamental Principles of Corpus Linguistics. Cambridge University Press.
McRae, Mike. 2018. “Science’s ’Replication Crisis’ Has Reached Even the Most Respectable Journals, Report Shows.” https://www.sciencealert.com/replication-results-reproducibility-crisis-science-nature-journals.
Munafò, Marcus R., and George Davey Smith. 2018. “Robust Research Needs Many Lines of Evidence.” Nature 553 (7689): 399–401. https://doi.org/10.1038/d41586-018-01023-3.
Nosek, Brian A., and Timothy M. Errington. 2020. “What Is Replication?” PLoS Biology 18 (3): e3000691. https://doi.org/10.1371/journal.pbio.3000691.
Piwowar, Heather A, and Todd J Vision. 2013. “Data Reuse and the Open Data Citation Advantage.” PeerJ 1: e175.
Schweinberger, Martin. 2024. “Implications of the Replication Crisis for Corpus Linguistics – Some Suggestions to Improve Reproducibility.” In Broadening Horizons: Data-Intensive Approaches to English, edited by Mikko Laitinen and Paula Rautionaho. Cambridge University Press.
Wilson, Greg, Jennifer Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, and Tracy K Teal. 2017. “Good Enough Practices in Scientific Computing.” PLoS Computational Biology 13 (6): e1005510.